AITopics | decision tree algorithm

Harnessing the Power of Choices in Decision Tree Learning

Neural Information Processing SystemsApr-30-2026, 10:23:10 GMT

We propose a simple generalization of standard and empirically successful decision tree learning algorithms such as ID3, C4.5, and CART. These algorithms, which have been central to machine learning for decades, are greedy in nature: they grow a decision tree by iteratively splitting on the best attribute. Our algorithm, Top-k, considers the k best attributes as possible splits instead of just the single best attribute.We demonstrate, theoretically and empirically, the power of this simple generalization. We first prove a greediness hierarchy theorem showing that for every k N, Top-(k +1) can be dramatically more powerful than Top-k: there are data distributions for which the former achieves accuracy 1 ε, whereas the latter only achieves accuracy 12 +ε. We then show, through extensive experiments, that Top-k outperforms the two main approaches to decision tree learning: classic greedy algorithms and more recent "optimal decision tree" algorithms. On one hand, Top-k consistently enjoys significant accuracy gains over greedy algorithms across a wide range of benchmarks. On the other hand, Top-k is markedly more scalable than optimal decision tree algorithms and is able to handle dataset and feature set sizes that remain far beyond the reach of these algorithms.

algorithm, artificial intelligence, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Industry:

Health & Medicine (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

A Communication-Efficient Parallel Algorithm for Decision Tree

Qi Meng, Guolin Ke, Taifeng Wang, Wei Chen, Qiwei Ye, Zhi-Ming Ma, Tie-Yan Liu

Neural Information Processing SystemsMar-23-2026, 01:57:49 GMT

Decision tree (and its extensions such as Gradient Boosting Decision Trees and Random Forest) is a widely used machine learning algorithm, due to its practical effectiveness and model interpretability. With the emergence of big data, there is an increasing need to parallelize the training process of decision tree. However, most existing attempts along this line suffer from high communication costs. In this paper, we propose a new algorithm, called Parallel Voting Decision Tree (PV-Tree), to tackle this challenge. After partitioning the training data onto a number of (e.g., M) machines, this algorithm performs both local voting and global voting in each iteration.

algorithm, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Harnessing the Power of Choices in Decision Tree Learning Guy Blanc

Neural Information Processing SystemsFeb-18-2026, 03:41:14 GMT

Decision trees are a fundamental workhorse in machine learning. Their logical and hierarchical structure makes them easy to understand and their predictions easy to explain.

algorithm, artificial intelligence, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

fddad60891bdf85aac8041f80ed022df-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 03:41:11 GMT

algorithm, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.97)

Add feedback

Optimal Sparse Decision Trees

Neural Information Processing SystemsFeb-6-2026, 11:48:28 GMT

Decision tree algorithms have been among the most popular algorithms for interpretable (transparent) machine learning since the early 1980's. The problem that has plagued decision tree algorithms since their inception is their lack of optimality, or lack of guarantees of closeness to optimality: decision tree algorithms are often greedy or myopic, and sometimes produce unquestionably suboptimal models. Hardness of decision tree optimization is both a theoretical and practical obstacle, and even careful mathematical programming approaches have not been able to solve these problems efficiently. This work introduces the first practical algorithm for optimal decision trees for binary variables. The algorithm is a co-design of analytical bounds that reduce the search space and modern systems techniques, including data structures and a custom bit-vector library. We highlight possible steps to improving the scalability and speed of future generations of this algorithm based on insights from our theory and experiments.

algorithm, artificial intelligence, machine learning, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Foundational theory for optimal decision tree problems. II. Optimal hypersurface decision tree algorithm

He, Xi

arXiv.org Artificial IntelligenceOct-28-2025

Decision trees are a ubiquitous model for classification and regression tasks due to their interpretability and efficiency. However, solving the optimal decision tree (ODT) problem remains a challenging combinatorial optimization task. Even for the simplest splitting rules--axis-parallel hyperplanes--it is NP-hard to optimize. In Part I of this series, we rigorously defined the proper decision tree model through four axioms and, based on these, introduced four formal definitions of the ODT problem. From these definitions, we derived four generic algorithms capable of solving ODT problems for arbitrary decision trees satisfying the axioms. We also analyzed the combinatorial geometric properties of hypersurfaces, showing that decision trees defined by polynomial hypersurface splitting rules satisfy the proper axioms that we proposed. In this second paper (Part II) of this two-part series, building on the algorithmic and geometric foundations established in Part I, we introduce the first hypersurface decision tree (HODT) algorithm. To the best of our knowledge, existing optimal decision tree methods are, to date, limited to hyperplane splitting rules--a special case of hypersurfaces--and rely on general-purpose solvers. In contrast, our HODT algorithm addresses the general hypersurface decision tree model without requiring external solvers. Using synthetic datasets generated from ground-truth hyperplane decision trees, we vary tree size, data size, dimensionality, and label and feature noise. Results showing that our algorithm recovers the ground truth more accurately than axis-parallel trees and exhibits greater robustness to noise. We also analyzed generalization performance across 30 real-world datasets, showing that HODT can achieve up to 30% higher accuracy than the state-of-the-art optimal axis-parallel decision tree algorithm when tree complexity is properly controlled.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.12057

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

fddad60891bdf85aac8041f80ed022df-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 12:45:23 GMT

algorithm, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.97)

Add feedback

A Review and Analysis of a Parallel Approach for Decision Tree Learning from Large Data Streams

Shiralizadeh, Zeinab

arXiv.org Artificial IntelligenceMay-20-2025

This work studies one of the parallel decision tree learning algorithms, pdsCART, designed for scalable and efficient data analysis. The method incorporates three core capabilities. First, it supports real-time learning from data streams, allowing trees to be constructed incrementally. Second, it enables parallel processing of high-volume streaming data, making it well-suited for large-scale applications. Third, the algorithm integrates seamlessly into the MapReduce framework, ensuring compatibility with distributed computing environments. In what follows, we present the algorithm's key components along with results highlighting its performance and scalability.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2505.1178

Genre: Research Report (0.64)

Industry: Information Technology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Identification of Hardware Trojan Locations in Gate-Level Netlist using Nearest Neighbour Approach integrated with Machine Learning Technique

Chattopadhyay, Anindita, Bisariya, Siddharth, Sutrakar, Vijay Kumar

arXiv.org Artificial IntelligenceJan-18-2025

In the evolving landscape of integrated circuit design, detecting Hardware Trojans (HTs) within a multi entity based design cycle presents significant challenges. This research proposes an innovative machine learning-based methodology for identifying malicious logic gates in gate-level netlists. By focusing on path retrace algorithms. The methodology is validated across three distinct cases, each employing different machine learning models to classify HTs. Case I utilizes a decision tree algorithm for node-to-node comparisons, significantly improving detection accuracy through the integration of Principal Component Analysis (PCA). Case II introduces a graph-to-graph classification using a Graph Neural Network (GNN) model, enabling the differentiation between normal and Trojan-infected circuit designs. Case III applies GNN-based node classification to identify individual compromised nodes and its location. Additionally, nearest neighbor (NN) method has been combined with GNN graph-to-graph in Case II and GNN node-to-node in Case III. Despite the potential of GNN model graph-to-graph classification, NN approach demonstrated superior performance, with the first nearest neighbor (1st NN) achieving 73.2% accuracy and the second nearest neighbor (2nd NN) method reaching 97.7%. In comparison, the GNN model achieved an accuracy of 62.8%. Similarly, GNN model node-to-node classification, NN approach demonstrated superior performance, with the 1st NN achieving 93% accuracy and the 2nd NN method reaching 97.7%. In comparison, the GNN model achieved an accuracy of 79.8%. However, higher and higher NN will lead to large code coverage for the identification of HTs.

artificial intelligence, machine learning, node, (19 more...)

arXiv.org Artificial Intelligence

2501.16347

Country:

Asia > India > Karnataka > Bengaluru (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Information Technology (0.93)
Semiconductors & Electronics (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Optimal Sparse Decision Trees

Neural Information Processing SystemsOct-10-2024, 17:29:06 GMT

Decision tree algorithms have been among the most popular algorithms for interpretable (transparent) machine learning since the early 1980's. The problem that has plagued decision tree algorithms since their inception is their lack of optimality, or lack of guarantees of closeness to optimality: decision tree algorithms are often greedy or myopic, and sometimes produce unquestionably suboptimal models. Hardness of decision tree optimization is both a theoretical and practical obstacle, and even careful mathematical programming approaches have not been able to solve these problems efficiently. This work introduces the first practical algorithm for optimal decision trees for binary variables. The algorithm is a co-design of analytical bounds that reduce the search space and modern systems techniques, including data structures and a custom bit-vector library.

algorithm, optimal sparse decision tree, tree algorithm, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Filters

Collaborating Authors

decision tree algorithm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Harnessing the Power of Choices in Decision Tree Learning

A Communication-Efficient Parallel Algorithm for Decision Tree

Harnessing the Power of Choices in Decision Tree Learning Guy Blanc

fddad60891bdf85aac8041f80ed022df-Paper-Conference.pdf

Optimal Sparse Decision Trees

Foundational theory for optimal decision tree problems. II. Optimal hypersurface decision tree algorithm

fddad60891bdf85aac8041f80ed022df-Paper-Conference.pdf

A Review and Analysis of a Parallel Approach for Decision Tree Learning from Large Data Streams

Identification of Hardware Trojan Locations in Gate-Level Netlist using Nearest Neighbour Approach integrated with Machine Learning Technique

Optimal Sparse Decision Trees